Reduce Quickstart Pipeline image size #347

Merged: 4 commits, Jul 31, 2024

@aidandunlop aidandunlop commented Jul 24, 2024

Closes #346
The quickstart pipeline uses the TFX image as its base, which is ~9GB and includes several extra tools that the quickstart pipeline does not need. This PR uses a slim Python image as the base instead and installs the tfx Python package and other dependencies directly, which reduces the image size dramatically.

Image size when using base TFX image:

  • 20GB (when built locally)
  • 9GB (virtual size)

Image size after using tfx python package on slim python:

  • 2.46GB (when built locally)
  • 784MB (virtual size)
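
To put the figures above in relative terms, here is a minimal sketch that computes the percentage reduction for each measurement. It assumes decimal units (1 GB = 1000 MB), which the PR does not state; the function and variable names are illustrative only.

```python
# Sizes quoted in the PR description, converted to MB.
# Assumption: decimal units (1 GB = 1000 MB); the PR does not say which convention applies.
MB_PER_GB = 1000


def reduction_percent(before_mb: float, after_mb: float) -> float:
    """Return the percentage by which the size shrank."""
    return (before_mb - after_mb) / before_mb * 100


# Locally built image: 20GB before, 2.46GB after.
local_reduction = reduction_percent(20 * MB_PER_GB, 2.46 * MB_PER_GB)

# Virtual size: 9GB before, 784MB after.
virtual_reduction = reduction_percent(9 * MB_PER_GB, 784)

print(f"local build:  {local_reduction:.1f}% smaller")
print(f"virtual size: {virtual_reduction:.1f}% smaller")
```

Under that assumption, the locally built image shrinks by roughly 88% and the virtual size by roughly 91%.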

A recent breaking change in setuptools, a transitive dependency of TFX, caused CsvExampleGen to fail (see main card). Pinning the setuptools version fixed it.
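
One way to guard against the pin silently drifting is a small startup check inside the image. The sketch below is not part of this PR: it assumes Python 3.8+ (for `importlib.metadata`), the helper names are hypothetical, and the pinned version is left as a placeholder because the PR description does not state it.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package: str, pinned: str) -> bool:
    """Return True only when `package` is installed at exactly the pinned version."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    # Placeholder: substitute the version actually pinned in the image.
    PINNED_SETUPTOOLS = "<pinned-version>"
    if not check_pin("setuptools", PINNED_SETUPTOOLS):
        found = installed_version("setuptools")
        print(f"setuptools is {found}, expected {PINNED_SETUPTOOLS}")
```

Running a check like this before the pipeline starts would surface a version mismatch directly, rather than as a failure inside CsvExampleGen.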

Tasks

  • Validate the pipeline runs successfully
  • Documentation updated

@adriangay adriangay marked this pull request as ready for review July 26, 2024 08:26
@adriangay adriangay requested review from a team and jmendesky and removed request for a team July 30, 2024 06:43
@aidandunlop
Contributor Author

OK to test

@aidandunlop aidandunlop merged commit a3bb6d7 into master Jul 31, 2024
2 checks passed
@aidandunlop aidandunlop deleted the reduce-quickstart-image-size branch July 31, 2024 08:56
grahamia added 2 commits that referenced this pull request Aug 2, 2024